
Azure Kubernetes Service (AKS) Overview

Introduction

Azure Kubernetes Service (AKS) is a managed container orchestration service provided by Microsoft Azure. It simplifies the deployment, management, and operations of Kubernetes, allowing developers to focus on building their applications without worrying about the underlying infrastructure.

What is a Kubernetes Cluster?

A Kubernetes cluster is a set of node machines for running containerized applications. A cluster has at least one worker node and a master node:

Master Node:

The master node manages the state of the cluster, schedules applications, and handles scaling requirements.

Worker Nodes:

Worker nodes are the machines that run the applications and workloads.
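
To see the worker nodes of an AKS cluster, you can pull the cluster credentials and list them with kubectl. A minimal sketch, assuming placeholder resource group and cluster names:

```bash
# Merge the cluster's credentials into your local kubeconfig
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster

# List the worker nodes in the cluster
kubectl get nodes -o wide
```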

Maintaining an AKS Cluster

Maintaining an AKS cluster involves several key activities:

- Monitoring: Keep track of the performance and health of your cluster and applications using tools like Azure Monitor and Azure Log Analytics.
- Upgrading: Regularly update your AKS cluster to the latest Kubernetes version to ensure you have the latest features and security fixes (see the commands below).
- Backup and Recovery: Implement backup strategies for your cluster's data and state to enable recovery in case of failures.
- Security: Ensure that your cluster is secure by managing access with Azure Active Directory and implementing network policies.
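
As a sketch of the upgrade workflow, the Azure CLI can list available Kubernetes versions and trigger an upgrade; resource group, cluster name, and version below are placeholders:

```bash
# Show the Kubernetes versions this cluster can upgrade to
az aks get-upgrades --resource-group myResourceGroup --name myAKSCluster --output table

# Upgrade the control plane and node pools to a newer version
az aks upgrade --resource-group myResourceGroup --name myAKSCluster --kubernetes-version 1.29.2
```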

Scaling Applications in AKS

Scaling applications in AKS can be done manually or automatically:

- Manual Scaling: You can manually scale the number of pods in a deployment or the number of nodes in your cluster (see the commands below).
- Horizontal Pod Autoscaler (HPA): HPA automatically adjusts the number of pods in a deployment based on CPU utilization or other select metrics.
- Vertical Pod Autoscaler (VPA): VPA automatically adjusts the CPU and memory reservations for your pods.
- Cluster Autoscaler: Automatically adjusts the number of nodes in your cluster based on the needs of your workloads and the constraints you define.
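
For example, manual scaling and a basic HPA can both be driven from the command line; the deployment name is a placeholder, and the HPA numbers mirror the recommendation later in this document:

```bash
# Manually scale a deployment to 2 replicas
kubectl scale deployment my-service --replicas=2

# Manually scale the cluster's node pool via the Azure CLI
az aks scale --resource-group myResourceGroup --name myAKSCluster --node-count 3

# Create an HPA: 1-3 replicas, targeting 80% CPU utilization
kubectl autoscale deployment my-service --min=1 --max=3 --cpu-percent=80
```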

Best Practices for Scaling

- Define Resource Requests and Limits: Properly define the resource requests and limits for your containers to ensure efficient scheduling and stability.
- Use Quality of Service (QoS) Classes: Kubernetes uses QoS classes to make decisions about scheduling and evicting pods (see the sketch after this list).
- Implement a Descheduler: For better resource utilization and balancing, consider using a descheduler to move pods from overutilized nodes to underutilized ones.
- Monitor and Adjust: Continuously monitor the performance of your applications and adjust your scaling strategies accordingly.
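
As an illustration of how requests and limits determine the QoS class, here is a minimal pod spec; the name, image, and values are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: qos-demo
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:         # requests < limits on all containers
          cpu: "100m"     # => QoS class "Burstable"
          memory: "300Mi"
        limits:           # requests == limits on every container would give "Guaranteed";
          cpu: "300m"     # no requests or limits at all gives "BestEffort"
          memory: "700Mi"
```

Running `kubectl describe pod qos-demo` shows the assigned QoS class.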

We do not recommend VPA and prefer HPA over it: HPA appears sufficient to handle the load and needs no alteration.

VPA has several limitations:

- In Kubernetes, the pod spec is immutable: it can't be updated in place. To update or change a pod's resource requests, VPA must evict the pod and re-create it, which disrupts the current workload.
- VPA is not yet ready for JVM-based workloads. This shortcoming is due to its limited visibility into memory usage for Java virtual machine workloads.
- VPA is not aware of Kubernetes cluster infrastructure variables such as node size in terms of memory and CPU, so it doesn't know whether a recommended pod size will fit your nodes. A recommended resource request may be too large for any node, and pods may go into a pending state because the request can't be met.
- VPA won't work with HPA using the same CPU and memory metrics, because that would cause a race condition: HPA would try to scale out (horizontally) based on CPU and memory while, at the same time, VPA would try to scale the pods up (vertically). If you need to use both HPA and VPA together, you must configure HPA to use a custom metric such as web requests (see the sketch after this list).
- VPA doesn't consider network and I/O.

VPA assists in adding or removing CPU and memory resources, but its inherent limitations make it too risky to use in a production environment.
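A sketch of such a custom-metric HPA, assuming a metrics adapter already exposes a per-pod metric; the metric name, target value, and deployment name are hypothetical:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-service
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Pods                         # custom per-pod metric instead of CPU/memory,
      pods:                              # so VPA can own the CPU/memory dimension
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "10"
```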

We can still run a VPA in recommendation-only mode, with the update policy turned off:

```yaml
updatePolicy:
  updateMode: "Off"
```
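
For context, a complete recommendation-only VPA object might look like this, assuming the VPA CRDs are installed and a Deployment named nginx-deployment (matching the command below):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-deployment-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  updatePolicy:
    updateMode: "Off"   # surface recommendations only; never evict pods
```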

With updateMode set to "Off", the VPA only generates recommendations and does not apply them automatically. Once the configuration is applied, retrieve the recommendations with:

```bash
kubectl describe vpa nginx-deployment-vpa
```

VPA Analysis from SAPM

The Current TPS Analysis


Split: 6.2% BW and 93.8% Plus

- Screen calls (both): 736k => avg. 0.3 transactions per second
  - MAX (per hour, both): 3,373 screen calls in an hour => ~1 transaction per second
  - MAX (per minute, both): 133 screen calls in a minute => ~2 transactions per second

The current load on the APIs is 1-2 TPS, and we expect a maximum of 10 TPS.

With the current TPS requirement, a minimum of 1 replica on a single node is sufficient to handle the load, so the minimum replica count can be 1. At 80% capacity we start scaling out horizontally, and we do not expect to need more than 3 replicas at present.

The current settings are well suited to these conditions.

Recommendation on resourcing for non-critical services:

cpu: "300" Performance testing for services in Dev for 400 Transaction per hour and 2 TPS with 1 pod set , is under the guidelines of 1 sec response time on average and stand good for the pod sizing Recommendation on resourcing for critical services

cpu: "700 cpu: "1Gi"

Scaling:

Autoscale from a minimum of 1 to a maximum of 3 replicas, with target utilization set to 80%:

```yaml
autoscaling:
  enabled: true  # must be enabled for the min/max and target settings below to take effect
  minReplicas: 1
  maxReplicas: 3
  targetCPUUtilizationPercentage: 80
  targetMemoryUtilizationPercentage: 80
```
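
Assuming these are Helm values, the chart would typically render them into an autoscaling/v2 HorizontalPodAutoscaler roughly like this; the deployment name is a placeholder:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-service
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```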

Resourcing recommendations for services (rows in bold are critical services and need a bit extra):

| # | Service Name | CPU Req (m) | RAM Req (Mi) | CPU Lim (m) | RAM Lim (Mi) |
|---|---|---|---|---|---|
| 1 | navida-pro-be-consent-service | 100 | 300 | 300 | 700 |
| 2 | navida-pro-be-content-service | 100 | 300 | 300 | 700 |
| 3 | **navida-pro-be-datastore-service** | 600 | 512 | 1000 | 1024 |
| 4 | navida-pro-be-yuble-bff-service | 100 | 300 | 300 | 700 |
| 5 | navida-pro-be-doctor-search-bff-service | 100 | 300 | 300 | 700 |
| 6 | navida-pro-be-auth-login-service | 100 | 300 | 300 | 700 |
| 7 | **navida-pro-be-user-profile-bff-service** | 300 | 512 | 700 | 1024 |
| 8 | navida-pro-be-magazine-bff-service | 100 | 300 | 300 | 700 |
| 9 | navida-pro-be-motivation-bff-service | 100 | 300 | 300 | 700 |
| 10 | navida-pro-be-bonus-bff-service | 100 | 300 | 300 | 700 |
| 11 | navida-pro-be-vorsorgekompass-bff-service | 100 | 300 | 300 | 700 |
| 12 | navida-pro-be-symptomcheck-bff-service | 100 | 300 | 300 | 700 |
| 13 | navida-pro-be-user-account-delete-service | 100 | 300 | 300 | 700 |
| 14 | navida-pro-be-user-common-service | 100 | 300 | 300 | 700 |
| 15 | navida-pro-be-notification-service | 100 | 300 | 300 | 700 |
| 16 | navida-pro-be-challenges-ds-service | 100 | 300 | 300 | 700 |
| 17 | navida-pro-be-video-consultation-service | 100 | 300 | 300 | 700 |
| 18 | navida-pro-be-mental-challenges-bff-service | 100 | 300 | 300 | 700 |
| 19 | navida-pro-be-bawu-bff-service | 100 | 300 | 300 | 700 |
| 20 | navida-pro-be-challenges-bff-service | 100 | 300 | 300 | 700 |
| 21 | navida-pro-be-challenges-foodie-bff-service | 100 | 300 | 300 | 700 |
| 22 | **navida-pro-be-cron-job-service** | 600 | 512 | 1000 | 1024 |
| 23 | **navida-pro-be-cms-service** | 300 | 512 | 700 | 1024 |
| 24 | navida-pro-be-challenges-support-tool | 100 | 300 | 300 | 700 |
| 25 | navida-pro-be-consent-support-tool | 100 | 300 | 300 | 700 |
| 26 | navida-pro-be-template-engine | 100 | 300 | 300 | 700 |
| 27 | **navida-pro-kong** | 600 | 512 | 1000 | 1024 |